Exhibit A 

Structural Domain Analysis of SEQ ID NO: 2 



INTERPRO 

InterPro is a database of protein families, domains and functional sites in
which identifiable features found in known proteins can be applied to unknown
protein sequences.


http://www.ebi.ac.uk/interpro/




InterPro matches (graphical output not reproduced):

IPR005828  Sugar transporter superfamily  Family  PS00217  SUGAR_TRANSPORT_2
noIPR

Pfam

Pfam is a large collection of multiple sequence alignments and hidden Markov models covering many common protein 
domains and families. For each family in Pfam you can: Look at multiple alignments; View protein domain architectures; 
Examine species distribution; Follow links to other databases; View known protein structures. 
http://www.sanger.ac.uk/Software/Pfam/index.shtml 

Model     Seq-from  Seq-to  HMM-from  HMM-to  Score   E-value  Alignment  Description
sugar_tr  23        429     1         487     -119.5  0.0007   glocal     Sugar (and other) transport



sugar_tr: domain 1 of 1, from 23 to 429: score -119.5, E = 0.0007 
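
The per-domain summary line above has a regular shape, so it can be read into structured fields. The following is a minimal sketch, assuming the line format exactly as printed in this exhibit; the names `DOMAIN_LINE` and `parse_domain_line` are hypothetical helpers and are not part of Pfam or HMMER.

```python
import re

# Pattern for a per-domain summary line as it appears in this exhibit, e.g.
# "sugar_tr: domain 1 of 1, from 23 to 429: score -119.5, E = 0.0007".
# The layout is an assumption taken from that single line.
DOMAIN_LINE = re.compile(
    r"^(?P<model>\S+): domain (?P<index>\d+) of (?P<total>\d+), "
    r"from (?P<start>\d+) to (?P<end>\d+): "
    r"score (?P<score>-?\d+(?:\.\d+)?), E = (?P<evalue>\S+)$"
)


def parse_domain_line(line: str) -> dict:
    """Return the fields of one domain summary line as a dictionary."""
    match = DOMAIN_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised domain line: {line!r}")
    start, end = int(match["start"]), int(match["end"])
    return {
        "model": match["model"],
        "index": int(match["index"]),
        "total": int(match["total"]),
        "start": start,
        "end": end,
        "span": end - start + 1,  # number of query residues covered
        "score": float(match["score"]),
        "evalue": float(match["evalue"]),
    }


if __name__ == "__main__":
    hit = parse_domain_line(
        "sugar_tr: domain 1 of 1, from 23 to 429: score -119.5, E = 0.0007"
    )
    print(hit)
```

For the hit reported here, the parsed span covers residues 23 to 429 of the 436-residue query.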

*->valvaalgGgfIfGyDtgviggflalidfIfrfglltssgalaslvg
+ a++G++l G + +++ +++i + +++ a
sequence 23 CQAWTGTLLLGTCLLYCARSSMPICTVSMSQDFGWNKKEA 62

ystvltglwsifflGrliGslfaGklgdrfGRkksllial....vlfvi
g+v s+ff G + +++G+lgdr+G k +l++++ + + ++
sequence 63 GIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSAsawgSITAV 106



GallsgaapgytTiGlwafyllivGRvlvGlgvGgasvlvPmYisEiAPk
ll +++ +++ + + R+l+Gl G+ + + ++s+ +
sequence 107 TPLLAHLSS AHLAFMTFSRILMGLLQGVYFPALTSLLSQKVRE 149

alRGalgslyqlaitiGilvAaiiglglnktnndsalnswgWRiplglql
+ R++ s+ + ++G l++ +g l + ++ W + +++
sequence 150 SERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYG--------WQSIFYFSG 191



vpalllligllflPESPRwLvekgkleeArevLaklrgvedvdqeiqeek
+ + l++ + + + R+L++ ++l A- vLa+.+r +
sequence 192 GLT LLWVWYVY RYLLSEKDLILALGVLAQSR-------P 223

aeleatvseekagkaswgelfrgrtpkyrqrllmgvmlqafqQltGiNai
++ + v+ w+ lfr + + + +v+ q+ + +
sequence 224 VSRHSRVP-------WRRLFRKPA---VWAAWSQLSA-ACSFFIL 258

fYYsptifksvGvsdsvasllvtiivgvvNfvfTfvaLiflvDr......
+ pt + f + + + + + + +v + + + + + +fl+D+ + + +
sequence 259 LSWLPTFFEETFPDAK--GWIFNVVPWLVAIPASLFS-GFLSDHlinqgy 305

..fGRRplll.IGaagmaicflilgasvivallllnkpkdpsskaagiva
+ + + R+l + ++G+ + + + + l lg + +++ + a
sequence 306 raITVRKLMQgMGLGLSSVFALCLG---HTSSFCESV----------VFA 342

ivfillfiafFalgwGpipwvilsElFPtkvRskalalataanwlanfii
. .+ i l + ++ g+ v + ++l p ++ + +++a a+ la++ +
sequence 343 SASIGLQTFNHS-GIS----VNIQDLAP-SCAGFLFGVANTAGALAGVVG 386

gflfpyitgaiglalggyvflvfagllvlfilfvfffvPETkGrtLEeie
l y++++ g + f + + a++ l+ + f +v G ++
sequence 387 VC LGGYLMETTG--SWTCLFNLVAIISNLGL--CTFLVFG---QAQR 426



elf<-*
+++
sequence 427 VDL 429



ProtComp 



http://www.hgmp.mrc.ac.uk/GenomeWeb/prot-anal.html

ProtComp Version 5. Identifying sub-cellular location (Animals&Fungi)

Seq name: sequence 436

Significant similarity in Location DB - Location: Plasma membrane

Database sequence: AC=Q9BYT1 Location: Plasma membrane DE BA305P22.2.1 (Novel
protein, isoform 1).

Score=21855, Sequence length=430, Alignment length=422
Predicted by Neural Nets - Plasma membrane with score 2.9

******** Transmembrane segments are found: .+166:179+. .+275:295-. .-399:412-.
******** Potential GPI-anchor in position 414 is found



7.8 



Location weights:
Nuclear
Plasma membrane
Extracellular
Cytoplasmic
Mitochondrial
Endoplasm. retic.
Peroxisomal
Lysosomal
Golgi
sequence  TMHMM2.0  outside  336  370
sequence  TMHMM2.0  TMhelix  371  393
sequence  TMHMM2.0  inside   394  399
sequence  TMHMM2.0  TMhelix  400  422
sequence  TMHMM2.0  outside  423  436
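
The TMHMM2.0 rows for this sequence (outside 336-370, TMhelix 371-393, inside 394-399, TMhelix 400-422, outside 423-436) can be checked mechanically for segment lengths and contiguity. The sketch below assumes each row has five whitespace-separated fields (sequence name, tool, region, start, end); `Segment`, `parse_tmhmm` and `helices` are hypothetical names introduced for illustration and are not part of TMHMM.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    region: str  # "outside", "inside" or "TMhelix"
    start: int   # first residue of the segment (1-based, inclusive)
    end: int     # last residue of the segment (inclusive)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# The five rows that survive in this exhibit, one segment per line.
TMHMM_ROWS = """\
sequence TMHMM2.0 outside 336 370
sequence TMHMM2.0 TMhelix 371 393
sequence TMHMM2.0 inside 394 399
sequence TMHMM2.0 TMhelix 400 422
sequence TMHMM2.0 outside 423 436
"""


def parse_tmhmm(text: str) -> List[Segment]:
    """Parse five-field TMHMM rows; lines with another shape are skipped."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        _name, _tool, region, start, end = fields
        segments.append(Segment(region, int(start), int(end)))
    return segments


def helices(segments: List[Segment]) -> List[Segment]:
    """Return only the predicted transmembrane helices."""
    return [s for s in segments if s.region == "TMhelix"]


if __name__ == "__main__":
    segs = parse_tmhmm(TMHMM_ROWS)
    for seg in helices(segs):
        print(f"TMhelix {seg.start}-{seg.end}: {seg.length} residues")
    # Each segment should begin one residue after the previous one ends.
    contiguous = all(b.start == a.end + 1 for a, b in zip(segs, segs[1:]))
    print("segments contiguous:", contiguous)
```

Both helices listed here span 23 residues, and the five segments tile residues 336 to 436 without gaps.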



[Plot: TMHMM posterior probabilities for sequence; legend: transmembrane, inside]



ProDom 

ProDom is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL 
sequence databases. Nucl. Acids Res. Corpet et al. 28 (1): 267.
http://prodes.toulouse.inra.fr/prodom/2002.1/html/home.php

HSP Results 

Warning: Original output has been filtered to yield non-redundant similarities 
BLASTP 2.2.1 [Apr-13-2001] 

Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search 
programs", Nucleic Acids Res. 25:3389-3402.

Query= 

(436 letters) 

Database: ProDom 2002.1 Jan2003 multiple alignments 

1,619,602 sequences; 167,025,341 total letters 



Searching...done



ProDom domains producing High-scoring Segment Pairs:

Position   ProDom domain   Score   E value
 13- 86    #PD004810       325     4e-31
 38-104    #PD003131        82     0.007
 38-104    #PD523332        86     0.002
 38-107    #PD535883       107     8e-06
 45- 95    #PD413016        89     0.001
 48-108    #PD543895        87     0.002
 55- 92    #PD063885        89     0.001
 55-108    #PD000036        95     2e-04
 67-108    #PD000082       109     5e-06
 87-132    #PD513011       165     2e-12
125-191    #PD078712        87     0.002
133-192    #PD000916       310     2e-29
207-255    #PD434467       244     1e-21
255-318    #PD413016       113     2e-06
256-325    #PD001152       369     4e-36
279-346    #PD394380       183     1e-14
327-371    #PD286146        91     6e-04
347-427    #PD508204       311     2e-29



>PD001152 (Closest domain: Q9BYT1_HUMAN 250-319)
Number of domains in family:
Commentary (automatic):

TRANSPORTER INORGANIC RENAL SODIUM NA-DEPENDENT

Length = 70
Score = 369 (146 bits), Expect = 4e-36
Identities = 70/70 (100%), Positives = 70/70 (100%)

Query: 256 FILLSWLPTFFEETFPDAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVRKLMQ 315
           FILLSWLPTFFEETFPDAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVRKLMQ
Sbjct: 250 FILLSWLPTFFEETFPDAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVRKLMQ 309

Query: 316 GMGLGLSSVF 325
           GMGLGLSSVF
Sbjct: 310 GMGLGLSSVF 319



>PD004810 (Closest domain: Q8VCL5_MOUSE 11-89)
Number of domains in family:
Commentary (automatic):

GLYCOPROTEIN CHROMOSOME NA-DEPENDENT III SYMPORT SODIUM

Length = 79
Score = 325 (129 bits), Expect = 4e-31
Identities = 57/74 (77%), Positives = 61/74 (82%)

Query: 13 AGDTQWSRPECQAWXXXXXXXXXXXYCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWG 72
          A DT+WSRPECQAW           YCAR +MP+CTV+MSQDFGWNKKEAGIVLSSFFWG
Sbjct: 16 AEDTRWSRPECQAWTGILLLGTCLLYCARVTMPVCTVAMSQDFGWNKKEAGIVLSSFFWG 75

Query: 73 YCLTQVVGGHLGDR 86
          YCLTQVVGGHLGDR
Sbjct: 76 YCLTQVVGGHLGDR 89



>PD508204 (Closest domain: Q9BYT1_HUMAN 341-421)
Number of domains in family:
Commentary (automatic):

NA-DEPENDENT BAC NOVEL SIMILAR THALIANA ARABIDOPSIS

Length = 81
Score = 311 (124 bits), Expect = 2e-29
Identities = 62/81 (76%), Positives = 62/81 (76%)

Query: 347 GLQTFNHSGISVNIQDLAPSCAGFLFXXXXXXXXXXXXXXXXXXXYLMETTGSWTCLFNL 406
           GLQTFNHSGISVNIQDLAPSCAGFLF                   YLMETTGSWTCLFNL
Sbjct: 341 GLQTFNHSGISVNIQDLAPSCAGFLFGVANTAGALAGVVGVCLGGYLMETTGSWTCLFNL 400

Query: 407 VAIISNLGLCTFLVFGQAQRV 427
           VAIISNLGLCTFLVFGQAQRV
Sbjct: 401 VAIISNLGLCTFLVFGQAQRV 421



>PD000916 (Closest domain: Q9BYT1_HUMAN 127-199)
Number of domains in family:
Commentary (automatic):

RESISTANCE MEMBRANE PROBABLE MULTIDRUG FAMILY

Length = 73
Score = 310 (124 bits), Expect = 2e-29
Identities = 60/60 (100%), Positives = 60/60 (100%)

Query: 133 GVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYGWQSIFYFSGG 192
           GVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYGWQSIFYFSGG
Sbjct: 127 GVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLLEWYGWQSIFYFSGG 186



>PD434467 (Closest domain: Q9BYT1_HUMAN 200-249)
Number of domains in family: 1
Commentary (automatic):

Length = 50
Score = 244 (98.6 bits), Expect = 1e-21
Identities = 48/49 (97%), Positives = 49/49 (99%)

Query: 207 SEKDLILALGVLAQSRPVSRHSRVPWRRLFRKPAVWAAWSQLSAACSF 255
           SEKDLILALGVLAQSRPVSRH+RVPWRRLFRKPAVWAAWSQLSAACSF
Sbjct: 201 SEKDLILALGVLAQSRPVSRHNRVPWRRLFRKPAVWAAWSQLSAACSF 249



>PD394380 (Closest domain: Q9DA66_MOUSE 1-99)
Number of domains in family: 1
Commentary (automatic):

Length = 99
Score = 183 (75.1 bits), Expect = 1e-14
Identities = 41/74 (55%), Positives = 54/74 (72%), Gaps = 7/74 (9%)

Query: 279 NVVPWLVAIPASLFSGFLSDHLIN---QGYRAITVRKLMQGMGLGLSSVFALCLGHT 332
           N + + p ++ + L S L+ HL+ QGYR ITVRK MQ MGLGLSS+FALCLGHT
Sbjct: 27  NLLPWLCL-LLLHSTLLAAHLLQGDLPQLQGYRVITVRKFMQVMGLGLSSIFALCLGHT 85

Query: 333 SSFCESVVFASASI 346
           +SF ++++FASASI
Sbjct: 86  TSFLKAMIFASASI 99



>PD513011 (Closest domain: Q9BYT1_HUMAN 81-126)
Number of domains in family: 1
Commentary (automatic):

Length = 46
Score = 165 (68.2 bits), Expect = 2e-12
Identities = 35/46 (76%), Positives = 35/46 (76%)

Query: 87 IGGEKVILLSASAWGSITAVTPXXXXXXXXXXXFMTFSRILMGLLQ 132
          IGGEKVILLSASAWGSITAVTP           FMTFSRILMGLLQ
Sbjct: 81 IGGEKVILLSASAWGSITAVTPLLAHLSSAHLAFMTFSRILMGLLQ 126



>PD413016 (Closest domain: Q8W4P5_ARATH 352-432)
Number of domains in family: 895
Commentary (automatic):

MULTIDRUG PROBABLE EFFLUX PERMEASE
Length = 81
Score = 113 (48.1 bits), Expect = 2e-06
Identities = 24/67 (35%), Positives = 35/67 (51%), Gaps = 4/67 (5%)

Query: 255 FFILLSWLPTFFEETFP---DAKGWIFNVVPWLVAIPASLFSGFLSDHLINQGYRAITVR 311
           FF++LSW+P +F + W F+ VPW + +GF SD LI +G R
Sbjct: 353 FFVILSWMPIYFNSVYHVl^TLKQAAW-FSAVPWSMMAFTGYIAGFWSDLLIRRGTSITLTR 411

Query: 312 KLMQGMG 318
           K+MQ +G
Sbjct: 412 KIMQSIG 418



>PD000082 (Closest domain: Q9SH82_ARATH 142-197)
Number of domains in family:
Commentary (automatic):

RESISTANCE MEMBRANE PROBABLE FAMILY MULTIDRUG
Length = 56
Score = 109 (46.6 bits), Expect = 5e-06
Identities = 19/42 (45%), Positives = 26/42 (61%)

Query: 67  SSFFWGYCLTQVVGGHLGDRIGGEKVILLSASAWGSITAVTP 108
           SSF WGY + V+GG L DR GG++V+ + W T +TP
Sbjct: 142 SSFLWGYIFSSVIGGALVDRYGGKRVLiAWGVALWSLATLLTP 183



>PD535883 (Closest domain: Q8YJH9_BRUME 1-144)
Number of domains in family: 1
Commentary (automatic):

Length = 144
Score = 107 (45.8 bits), Expect = 8e-06
Identities = 21/70 (30%), Positives = 42/70 (60%), Gaps = 1/70 (1%)

Query: 38 YCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSA 97
          Y R ++ + + ++G+N+ + G +L F +GY ++GG L D++G K+ + + +
Sbjct: 49 YIDRGAISYASEQIIGEYGFNRADWGSMLGYFGYGYMFGAILGGTLSDKLGARKLWIIAG 108

Query: 98  SAWGSITAVT 107
           +AW SI AV+
Sbjct: 109 TAW-SIVAVS 117



>PD000036 (Closest domain: Q9V905_DROME 63-130)
Number of domains in family:
Commentary (automatic):

SODIUM-DEPENDENT CARRIER SODIUM-PHOSPHATE SODIUM FAMILY
Length = 68
Score = 95 (41.2 bits), Expect = 2e-04
Identities = 16/54 (29%), Positives = 30/54 (54%)

Query: 55 FGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSASAWGSITAVTP 108
          + W + + ++L ++F+GY +T+GL++G V S G +TA+TP
Sbjct: 63 YNWTQSDQALLLGAYFYGYMITSLPAGTLAEMLGARNVAGYSCLVAGILTALTP 116



>PD286146 (Closest domain: Q9SH82_ARATH 407-561)
Number of domains in family: 1
Commentary (automatic):

Length = 155
Score = 91 (39.7 bits), Expect = 6e-04
Identities = 18/45 (40%), Positives = 28/45 (62%)

Query: 327 LCLGHTSSFCESVVFASASIGLQTFNHSGISVNIQDLAPSCAGFL 371
           LCL S + VF + + + L +F+ +G +N+QD+AP AGFL
Sbjct: 411 LCLNFAKSPSCAAVFMTIALSLSSFSQAGFLLNMQDIAPQYAGFL 455



>PD413016 (Closest domain: Q99TA7_STAAM 17-104)
Number of domains in family: 895
Commentary (automatic):

MULTIDRUG PROBABLE EFFLUX PERMEASE
Length = 88
Score = 89 (38.9 bits), Expect = 0.001
Identities = 19/51 (37%), Positives = 31/51 (60%)

Query: 45 PICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILL 95
          P+ T+ M Q+ G + AG+VL +G + ++GG L D++GG K IL+
Sbjct: 26 PLNTIYMKQELGKSLTVAGLVLMINSFGMVIGNLLGGSLFDKLGGYKTILI 76



>PD063885 (Closest domain: Q9V763_DROME 1-161)
Number of domains in family: 2
Commentary (automatic):

COTRANSPORTER
Length = 161
Score = 89 (38.9 bits), Expect = 0.001
Identities = 16/38 (42%), Positives = 23/38 (60%)

Query: 55 FGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKV 92
          F WN+K+ G +L SFFW + Q+ GG L + G + V
Sbjct: 83 FHWNEKQQGALLGSFFWAHWTLQIPGGILATKYGTKLV 120



>PD078712 (Closest domain: Q23063_CAEEL 5-202)
Number of domains in family: 3
Commentary (automatic):

Length = 198
Score = 87 (38.1 bits), Expect = 0.002
Identities = 22/69 (31%), Positives = 34/69 (48%), Gaps = 2/69 (2%)

Query: 125 RILMGLLQGVYFPALTSLLSQKVRESERAFTYSIVGAGSQFGTLLTGAVGSLLL--EWYG 182
           R G q L+ + ESE +F + SI + A SQFG L T +G + ++G
Sbjct: 118 RFFAGFAQASQLHFTNDLVLRWTPESEASFFFSIMLATSQFGPLFTMILGGEMCSSSFFG 177

Query: 183 WQSIFYFSG 191
           W+ + +Y G
Sbjct: 178 WEATYYILG 186



>PD543895 (Closest domain: Q8ZR98_SALTY 217-325)
Number of domains in family: 8
Commentary (automatic):

TRANSMEMBRANE MEMBRANE ANTIBIOTIC

Length = 109
Score = 87 (38.1 bits), Expect = 0.002
Identities = 22/61 (36%), Positives = 32/61 (52%), Gaps = 6/61 (9%)

Query: 48  TVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSASAWGSITAVT 107
           T + Q FG + + A + L +F + V+GG +GD+IG + VI WGSI V
Sbjct: 251 TFYLMQKFGLSIQNAQLHLFAFLFAVAAGTVIGGPVGDKIGRKYVI------WGSILGVA 304

Query: 108 P 108
           P
Sbjct: 305 P 305



>PD523332 (Closest domain: Q8ZK69_SALTY 1-107)
Number of domains in family: 10
Commentary (automatic):

PERMEASE PROBABLE 2-KETOGLUCONATE INTEGRAL

Length = 107
Score = 86 (37.7 bits), Expect = 0.002
Identities = 18/67 (26%), Positives = 32/67 (46%)

Query: 38 YCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSA 97
          Y RS++ + +++ D + A IVL+ F GY + + GG R +K+++L
Sbjct: 22 YLDRSNLSVTLPTITHDLNIDGATASIVLTIFLIGYAFSNIFGGVFTQRYDPKKIVILMV 81

Query: 98 SAWGSIT 104
          W T
Sbjct: 82 LIWSIAT 88



>PD003131 (Closest domain: Q9RPP3_BURCE 19-130)
Number of domains in family:
Commentary (automatic):

PLASMID PROBABLE 4-HYDROXYPHENYLACETATE MFS PHTHALATE

Length = 112
Score = 82 (36.2 bits), Expect = 0.007
Identities = 20/67 (29%), Positives = 31/67 (45%)

Query: 38 YCARSSMPICTVSMSQDFGWNKKEAGIVLSSFFWGYCLTQVVGGHLGDRIGGEKVILLSA 97
          Y R ++ + + D G + G+ +S FF GY L +V L RIG K +
Sbjct: 45 YLDRVNVSFAQLQLKHDLGLSDAAYGLGVSLFFIGYILLEVPSTLLLRRIGARKTVTRIM 104

Query: 98  SAWGSIT 104
             WG+I+
Sbjct: 105 LLWGAIS 111




Parameters:

Database: ProDom 2002.1 Jan2003 multiple alignments
Number of letters in database: 167,025,341
Number of sequences in database: 1,619,602

Lambda     K        H
0.325      0.138    0.441

Gapped
Lambda     K        H
0.267      0.0410   0.140
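
The gapped Lambda and K values printed in these parameters (0.267 and 0.0410) relate each raw BLAST score to the bit score shown beside it in the HSP reports. The sketch below uses the standard Karlin-Altschul conversion S' = (lambda * S - ln K) / ln 2; the function name `bit_score` and the choice of the gapped parameters are assumptions made for illustration, not something the exhibit states.

```python
import math

# Gapped Karlin-Altschul parameters as printed in the BLASTP footer of this exhibit.
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def bit_score(raw_score: float,
              lam: float = GAPPED_LAMBDA,
              k: float = GAPPED_K) -> float:
    """Convert a raw BLAST score S to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


if __name__ == "__main__":
    # Raw scores taken from the HSP reports in this exhibit.
    for raw in (369, 244, 183, 109):
        print(f"raw score {raw} -> {bit_score(raw):.1f} bits")
```

Under these assumptions a raw score of 244 converts to about 98.6 bits and 183 to about 75.1 bits, in line with the values reported for PD434467 and PD394380; the reports appear to print scores of 100 bits or more as truncated integers.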